Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously and its evolution rate is bounded; 2) a current model is known at each decision epoch but not its evolution. Our contribution can be presented in four points. First, we define this specific class of MDPs, which we call Non-Stationary MDPs (NSMDPs). We introduce the notion of regular evolution by making a hypothesis of Lipschitz continuity of the transition and reward functions w.r.t. time. Second, we consider a planning agent using the current model of the environment but unaware of its future evolution. This leads us to consider a worst-case method where the environment is seen as an adversarial agent. Third, following this approach, we propose the Risk-Averse Tree-Search (RATS) algorithm, a zero-shot Model-Based method similar to minimax search. Finally, we illustrate the benefits brought by RATS empirically and compare its performance with reference Model-Based algorithms.
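As a rough, self-contained illustration of the worst-case idea described above (an adversary constrained by the Lipschitz drift budget accumulated since the model snapshot), here is a toy Python sketch. The function name, the reward-only drift, and the 2-state MDP are all illustrative assumptions, not the RATS algorithm from the paper; in particular, the paper also bounds the drift of the transition function, which is omitted here for brevity.

```python
# Toy sketch of worst-case planning under a Lipschitz evolution bound.
# Only a snapshot model at t = 0 is known; an adversary may lower each
# reward by up to L * t, the drift budget accumulated by decision epoch t.

def worst_case_plan(state, t, horizon, P, R, L, actions):
    """Max over actions of the worst-case return from `state` at epoch `t`,
    where rewards may have drifted by at most L * t since the snapshot."""
    if t == horizon:
        return 0.0
    best = float("-inf")
    for a in actions:
        r = R[(state, a)] - L * t  # worst admissible reward at epoch t
        future = sum(
            p * worst_case_plan(s2, t + 1, horizon, P, R, L, actions)
            for s2, p in P[(state, a)].items()
        )
        best = max(best, r + future)
    return best

# Toy 2-state snapshot: state 0 loops on itself; action "a" pays 1.0.
P = {(0, "a"): {0: 1.0}, (0, "b"): {0: 1.0},
     (1, "a"): {1: 1.0}, (1, "b"): {1: 1.0}}
R = {(0, "a"): 1.0, (0, "b"): 0.5, (1, "a"): 0.0, (1, "b"): 0.0}
v = worst_case_plan(0, 0, 3, P, R, L=0.1, actions=["a", "b"])
# worst-case value: (1.0 - 0.0) + (1.0 - 0.1) + (1.0 - 0.2) = 2.7
```

The recursion mirrors a minimax tree search in spirit: the agent maximizes over actions while the environment, within its Lipschitz budget, resolves the remaining model uncertainty against the agent.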
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.97)
Reviews: Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
UPDATE: I have read the authors' response and increased my score. Specifically, the authors clarified Property 1, correcting my earlier misunderstanding, and properly framed the relaxation of the problem in Section 5. Please include similar clarifications in the final work. There was also a lot of discussion among the reviewers about how the paper relates to the robust MDP literature, which needs to be covered better in the current work. Papers such as "Reinforcement Learning in Robust Markov Decision Processes" and "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" were brought up by other reviewers, seem applicable in the current setting, and could serve as empirical competitors to RATS. I very much like the constraints used to study planning in non-stationary environments in this paper, and the minimax-inspired RATS algorithm seems like an appropriate game-theoretic approach.
The reviewers felt that this paper was well executed, even though the proposed approach is a rather straightforward application of techniques from the robust MDP literature (specifically, minimax planning with appropriately defined uncertainty sets derived from a Lipschitzness assumption). For the final version, the authors should improve the discussion of related literature on robust MDPs (e.g., "Reinforcement Learning in Robust Markov Decision Processes" by Lim et al., NIPS 2013, and the references therein) and on MDPs with non-stationary transitions (e.g., "Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions" by Abbasi-Yadkori et al., NIPS 2013, and the references therein).
Lecarpentier, Erwan, Rachelson, Emmanuel